# must go first
%matplotlib inline
%config InlineBackend.figure_format='retina'
# plotting
import matplotlib as mpl
from matplotlib import pyplot as plt
import seaborn as sns
sns.set_context("poster", font_scale=1.3)
import folium
# system packages
import os, sys
import warnings
warnings.filterwarnings('ignore')
# basic wrangling
import numpy as np
import pandas as pd
# eda tools
import pivottablejs
import missingno as msno
import pandas_profiling
# File with functions from prior notebook(s)
sys.path.append('../scripts/')
from aqua_helper import time_slice, country_slice, time_series, simple_regions, subregion, variable_slice
# Update matplotlib defaults to something nicer
mpl_update = {'font.size':16,
'xtick.labelsize':14,
'ytick.labelsize':14,
'figure.figsize':[12.0,8.0],
'axes.prop_cycle':mpl.cycler(color=['#0055A7', '#2C3E4F', '#26C5ED', '#00cc66', '#D34100', '#FF9700','#091D32']),  # 'axes.color_cycle' was removed from newer matplotlib releases
'axes.labelsize':20,
'axes.labelcolor':'#677385',
'axes.titlesize':20,
'lines.color':'#0055A7',
'lines.linewidth':3,
'text.color':'#677385'}
mpl.rcParams.update(mpl_update)
Exploratory data analysis consists of the following major tasks, which we present linearly here because each task makes little sense without the ones before it. In practice, however, you will constantly jump from step to step. You may want to do all the steps for a subset of the variables first. Often, an observation will raise a question you want to investigate, and you'll branch off to answer it before returning to the main path of exhaustive EDA.
Throughout the entire analysis you want to:
Write down the questions that the results raise as you go, and keep updating your list of hypotheses.
data = pd.read_csv('../data/aquastat/aquastat.csv.gzip', compression='gzip')
data[['variable','variable_full']].drop_duplicates()
| | variable | variable_full |
|---|---|---|
| 0 | total_area | Total area of the country (1000 ha) |
| 576 | arable_land | Arable land area (1000 ha) |
| 1152 | permanent_crop_area | Permanent crops area (1000 ha) |
| 1728 | cultivated_area | Cultivated area (arable land + permanent crops... |
| 2304 | percent_cultivated | % of total country area cultivated (%) |
| 2880 | total_pop | Total population (1000 inhab) |
| 3456 | rural_pop | Rural population (1000 inhab) |
| 4032 | urban_pop | Urban population (1000 inhab) |
| 4608 | gdp | Gross Domestic Product (GDP) (current US$) |
| 5184 | gdp_per_capita | GDP per capita (current US$/inhab) |
| 5760 | agg_to_gdp | Agriculture, value added to GDP (%) |
| 6336 | human_dev_index | Human Development Index (HDI) [highest = 1] (-) |
| 6912 | gender_inequal_index | Gender Inequality Index (GII) [equality = 0; i... |
| 7488 | percent_undernourished | Prevalence of undernourishment (3-year average... |
| 8064 | number_undernourished | Number of people undernourished (3-year averag... |
| 8640 | avg_annual_rain_depth | Long-term average annual precipitation in dept... |
| 9216 | avg_annual_rain_vol | Long-term average annual precipitation in volu... |
| 9792 | national_rainfall_index | National Rainfall Index (NRI) (mm/year) |
| 10368 | surface_water_produced | Surface water produced internally (10^9 m3/year) |
| 10944 | groundwater_produced | Groundwater produced internally (10^9 m3/year) |
| 11520 | surface_groundwater_overlap | Overlap between surface water and groundwater ... |
| 12096 | irwr | Total internal renewable water resources (IRWR... |
| 12672 | irwr_per_capita | Total internal renewable water resources per c... |
| 13248 | surface_entering | Surface water: entering the country (total) (1... |
| 13824 | surface_inflow_submit_no_treaty | Surface water: inflow not submitted to treatie... |
| 14400 | surface_inflow_submit_treaty | Surface water: inflow submitted to treaties (1... |
| 14976 | surface_inflow_secure_treaty | Surface water: inflow secured through treaties... |
| 15552 | total_flow_border_rivers | Surface water: total flow of border rivers (10... |
| 16128 | accounted_flow_border_rivers | Surface water: accounted flow of border rivers... |
| 16704 | accounted_flow | Surface water: accounted inflow (10^9 m3/year) |
| 17280 | surface_to_other_countries | Surface water: leaving the country to other co... |
| 17856 | surface_outflow_submit_no_treaty | Surface water: outflow to other countries not ... |
| 18432 | surface_outflow_submit_treaty | Surface water: outflow to other countries subm... |
| 19008 | surface_outflow_secure_treaty | Surface water: outflow to other countries secu... |
| 19584 | surface_total_external_renewable | Surface water: total external renewable (10^9 ... |
| 20160 | groundwater_entering | Groundwater: entering the country (total) (10^... |
| 20736 | groundwater_accounted_inflow | Groundwater: accounted inflow (10^9 m3/year) |
| 21312 | groundwater_to_other_countries | Groundwater: leaving the country to other coun... |
| 21888 | groundwater_accounted_outflow | Groundwater: accounted outflow to other countr... |
| 22464 | water_total_external_renewable | Water resources: total external renewable (10^... |
| 23040 | total_renewable_surface | Total renewable surface water (10^9 m3/year) |
| 23616 | total_renewable_groundwater | Total renewable groundwater (10^9 m3/year) |
| 24192 | overlap_surface_groundwater | Overlap: between surface water and groundwater... |
| 24768 | total_renewable | Total renewable water resources (10^9 m3/year) |
| 25344 | dependency_ratio | Dependency ratio (%) |
| 25920 | total_renewable_per_capita | Total renewable water resources per capita (m3... |
| 26496 | exploitable_regular_renewable_surface | Exploitable: regular renewable surface water (... |
| 27072 | exploitable_irregular_renewable_surface | Exploitable: irregular renewable surface water... |
| 27648 | exploitable_total_renewable_surface | Exploitable: total renewable surface water (10... |
| 28224 | exploitable_regular_renewable_groundwater | Exploitable: regular renewable groundwater (10... |
| 28800 | exploitable_total | Total exploitable water resources (10^9 m3/year) |
| 29376 | interannual_variability | Interannual variability (WRI) (-) |
| 29952 | seasonal_variability | Seasonal variability (WRI) (-) |
| 30528 | total_dam_capacity | Total dam capacity (km3) |
| 31104 | dam_capacity_per_capita | Dam capacity per capita (m3/inhab) |
| 31680 | irrigation_potential | Irrigation potential (1000 ha) |
| 32256 | flood_occurence | Flood occurrence (WRI) (-) |
| 32832 | total_pop_access_drinking | Total population with access to safe drinking-... |
| 33408 | rural_pop_access_drinking | Rural population with access to safe drinking-... |
| 33984 | urban_pop_access_drinking | Urban population with access to safe drinking-... |
Simplify regions
data.region = data.region.apply(lambda x: simple_regions[x])
Before trying to understand what information is in the data, make sure you understand what the data represents and what's missing.
What data isn’t there?
Questions to be considering
The missingno package provides a number of functions for visualizing what data is missing within a dataset.
recent = time_slice(data, '2013-2017')
msno.bar(recent, labels=True)
Discussion: What questions does this figure bring up?
Add these to your list of questions!
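As a rough cross-check on the bar chart, the same completeness numbers can be computed directly with pandas. The sketch below uses a small made-up frame (the `toy` data and its values are illustrative, not taken from the AQUASTAT file) and counts non-null entries per column, which is the quantity a nullity bar chart displays.

```python
import numpy as np
import pandas as pd

# Toy wide-format frame: one row per country, one column per variable.
# Values are made up for illustration only.
toy = pd.DataFrame(
    {
        "total_pop": [10.0, 20.0, 30.0, 40.0],
        "gdp": [1.0, np.nan, 3.0, 4.0],
        "exploitable_total": [np.nan, np.nan, np.nan, 5.0],
    },
    index=["A", "B", "C", "D"],
)

# Number of non-null entries per column.
non_null_counts = toy.notnull().sum()

# Fraction of rows that have a value for each column.
completeness = toy.notnull().mean()

print(non_null_counts)
print(completeness)
```

Sorting `completeness` makes it easy to list the sparsest variables first when writing down questions.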
msno.matrix(recent, labels=True)
Discussion: What additional information does this provide or what additional questions does it suggest?
"Exploitable" variables are missing for most countries.
Question to consider: Does this happen in each time period?
msno.matrix(variable_slice(data, 'exploitable_total'), inline=False, sort='descending');
plt.xlabel('Time period');
plt.ylabel('Country');
plt.title('Missing total exploitable water resources data across countries and time periods \n \n \n \n');
Total exploitable water resources is only reported on for a fraction of the countries and only a very small fraction of those countries have data for the most recent time period. Either a) data has not been reported yet and it will be at some point or b) most countries have stopped reporting on this factor or c) we do not have the domain knowledge to understand what's happening.
We are going to remove the exploitable variables from future analysis because so few data points can cause a lot of problems.
data = data.loc[~data.variable.str.contains('exploitable'),:]
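The decision to drop a sparse variable can also be made from a coverage number rather than by eye. The following is a minimal sketch on made-up long-format data that borrows the notebook's column names (`variable`, `value`); the 50% cut-off `MIN_COVERAGE` is a hypothetical choice, not something the notebook prescribes.

```python
import numpy as np
import pandas as pd

# Toy long-format data using the notebook's column names; values are made up.
long_df = pd.DataFrame(
    {
        "country": ["A", "B", "A", "B", "A", "B"],
        "time_period": ["2013-2017"] * 6,
        "variable": ["gdp", "gdp", "total_pop", "total_pop",
                     "exploitable_total", "exploitable_total"],
        "value": [1.0, 2.0, 10.0, np.nan, np.nan, np.nan],
    }
)

# Fraction of non-null values for each variable.
coverage = long_df.groupby("variable")["value"].apply(lambda s: s.notnull().mean())

# Hypothetical cut-off: keep variables with at least 50% of values present.
MIN_COVERAGE = 0.5
keep = coverage[coverage >= MIN_COVERAGE].index

filtered = long_df[long_df["variable"].isin(keep)]
print(coverage)
print(sorted(filtered["variable"].unique()))
```

Unlike matching on the substring 'exploitable', a threshold like this would also catch other sparse variables, so it is worth inspecting `coverage` before dropping anything.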
msno.matrix(variable_slice(data, 'national_rainfall_index'),
inline=False, sort='descending');
plt.xlabel('Time period');
plt.ylabel('Country');
plt.title('Missing national rainfall index data across countries and time periods \n \n \n \n');
National rainfall index is no longer reported on after 2002.
data = data.loc[~(data.variable=='national_rainfall_index')]
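A statement such as "no longer reported after 2002" can be checked for every variable at once. The sketch below, on made-up long-format data with the notebook's column names, finds the latest time period in which each variable has any reported value. It assumes period labels like '1998-2002' sort chronologically as strings.

```python
import numpy as np
import pandas as pd

# Toy long-format data; values are made up for illustration.
long_df = pd.DataFrame(
    {
        "time_period": ["1998-2002", "2003-2007", "2008-2012"] * 2,
        "variable": ["national_rainfall_index"] * 3 + ["gdp"] * 3,
        "value": [950.0, np.nan, np.nan, 1.0, 2.0, 3.0],
    }
)

# Keep only rows with a reported value, then take the latest period per variable.
# Labels such as '1998-2002' sort correctly as strings because they all
# start with a four-digit year.
last_reported = (
    long_df.dropna(subset=["value"])
    .groupby("variable")["time_period"]
    .max()
)
print(last_reported)
```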
Let's look at North America only.
north_america = subregion(data, 'North America')
msno.bar(msno.nullity_sort(time_slice(north_america, '2013-2017'), sort='descending').T, inline=False)
plt.title('Fraction of fields complete by country for North America \n \n');
Question: Is there any pattern in the countries with most missing data?
Question: What are potential reasons for missing data? What can we check?
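One thing that can be checked is whether the amount of missing data per country moves together with some other attribute, such as population. The sketch below is illustrative only: the frame, the country labels and the values are made up, and population is just one candidate explanation among many.

```python
import numpy as np
import pandas as pd

# Toy wide-format frame: rows are countries, columns are variables (made-up values).
wide = pd.DataFrame(
    {
        "total_pop": [300.0, 0.4, 35.0],
        "gdp": [1.0, np.nan, 2.0],
        "total_dam_capacity": [5.0, np.nan, np.nan],
    },
    index=["Large", "Small island", "Medium"],
)

# Fraction of variables missing for each country (row-wise mean of the null mask).
missing_fraction = wide.isnull().mean(axis=1).sort_values(ascending=False)

# Put the missing fraction next to a candidate explanatory variable
# (population here) to eyeball whether the two move together.
comparison = pd.DataFrame(
    {"missing_fraction": missing_fraction, "total_pop": wide["total_pop"]}
)
print(comparison.sort_values("missing_fraction", ascending=False))
```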
folium.Map(location=[18.1160128,-77.8364762], tiles="CartoDB positron",
zoom_start=5, width=1200, height=600)
Spot check what data is missing for the Bahamas to get a more granular understanding.
msno.nullity_filter(country_slice(data, 'Bahamas').T, filter='bottom', p=0.1)
| time_period | dam_capacity_per_capita | flood_occurence | gender_inequal_index | groundwater_produced | interannual_variability | irrigation_potential | number_undernourished | overlap_surface_groundwater | percent_undernourished | seasonal_variability | surface_groundwater_overlap | surface_water_produced | total_dam_capacity | total_renewable_groundwater | total_renewable_surface |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1958-1962 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1963-1967 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1968-1972 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1973-1977 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1978-1982 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1983-1987 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1988-1992 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1993-1997 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1998-2002 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2003-2007 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2008-2012 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2013-2017 | NaN | NaN | 0.2979 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
To do: Choose another region to assess for missing data.
# JSON with coordinates for country boundaries
geo = r'../data/aquastat/world.json'
null_data = recent['agg_to_gdp'].notnull()*1
map = folium.Map(location=[48, -102], zoom_start=2)
map.choropleth(geo_path=geo,
data=null_data,
columns=['country', 'agg_to_gdp'],
key_on='feature.properties.name', reset=True,
fill_color='GnBu', fill_opacity=1, line_opacity=0.2,
legend_name='Missing agricultural contribution to GDP data 2013-2017')
map
Question: What does the very pale green mean, compared to the green? (E.g., Greenland versus Canada)
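When interpreting the colors, it may help to keep three cases apart: a value is reported, a value is NaN, or a name in the boundary file has no matching row in the data at all. Whether the last case explains any particular color on the map is not confirmed here. The sketch only shows how to tell the cases apart, using made-up values and a hypothetical list of map names.

```python
import numpy as np
import pandas as pd

# Made-up values for one variable, indexed by country.
values = pd.Series({"Canada": 1.8, "Bahamas": np.nan}, name="agg_to_gdp")

# Same indicator as in the notebook: 1 where a value is reported, 0 where it is NaN.
indicator = values.notnull() * 1

# Hypothetical list of names found in the map's boundary file.
map_names = ["Canada", "Bahamas", "Greenland"]

# Reindexing exposes names with no row at all: they become NaN, not 0.
aligned = indicator.reindex(map_names)
print(aligned)
```

Comparing the set of names in the boundary file with the set of country names in the data is a quick way to find such unmatched regions.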
Now let's wrap this in a function so we can look at other variables geospatially.
def plot_null_map(df, time_period, variable,
                  legend_name=None):
    # JSON with coordinates for country boundaries
    geo = r'../data/aquastat/world.json'
    ts = time_slice(df, time_period).reset_index().copy()
    # 1 where the variable is reported, 0 where it is missing
    ts[variable]=ts[variable].notnull()*1
    map = folium.Map(location=[48, -102], zoom_start=2)
    map.choropleth(geo_path=geo,
                   data=ts,
                   columns=['country', variable],
                   key_on='feature.properties.name', reset=True,
                   fill_color='GnBu', fill_opacity=1, line_opacity=0.2,
                   legend_name=legend_name if legend_name else variable)
    return map
plot_null_map(data, '2013-2017', 'number_undernourished', 'Number undernourished is missing')
Question: Are there any patterns in missing data? Any questions that come to mind for further investigation?
To do: Look at other variables
fig, ax = plt.subplots(figsize=(16, 16));
sns.heatmap(data.groupby(['time_period','variable']).value.count().unstack().T , ax=ax);
plt.xticks(rotation=45);
plt.xlabel('Time period');
plt.ylabel('Variable');
plt.title('Number of countries with data reported for each variable over time');
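The table behind this heatmap is worth inspecting on its own. The following sketch reproduces the same `groupby` / `count` / `unstack` chain on made-up long-format data so the shape of the result is easy to see: `count` ignores NaN, so each cell is the number of countries with a reported value.

```python
import numpy as np
import pandas as pd

# Toy long-format data with the notebook's column names; values are made up.
long_df = pd.DataFrame(
    {
        "country": ["A", "B", "A", "B", "A", "B", "A", "B"],
        "time_period": ["2008-2012", "2008-2012", "2013-2017", "2013-2017"] * 2,
        "variable": ["gdp"] * 4 + ["irrigation_potential"] * 4,
        "value": [1.0, 2.0, 3.0, 4.0, 5.0, np.nan, np.nan, np.nan],
    }
)

# Rows: variables, columns: time periods, cells: number of non-null values.
counts = (
    long_df.groupby(["time_period", "variable"])["value"]
    .count()
    .unstack()
    .T
)
print(counts)
```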
Before trying to understand what information is in the data, make sure you understand what the data represents.
Sanity check! Do the values make sense?
Things to do:
Questions to consider:
This stage really morphs into the univariate exploration that comes next, as you are often diving into each variable one by one: first understanding it, then exploring it, then checking that understanding again. We can, however, do some initial profiling with a few handy Python packages.
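Before reaching for a profiling package, a few of the same summary numbers can be assembled with plain pandas. This is a minimal sketch on a made-up frame, not a replacement for the reports that follow. The column names in `profile` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Toy wide-format frame with made-up values.
wide = pd.DataFrame(
    {
        "dependency_ratio": [0.0, 0.0, 40.0, np.nan],
        "gdp_per_capita": [500.0, 5000.0, 15000.0, 50000.0],
    }
)

# One row per variable with a handful of sanity-check statistics.
profile = pd.DataFrame(
    {
        "missing_pct": wide.isnull().mean() * 100,
        "zeros_pct": (wide == 0).mean() * 100,  # NaN == 0 is False, so NaN is not a zero
        "distinct": wide.nunique(),             # NaN is not counted as a distinct value
        "min": wide.min(),
        "max": wide.max(),
    }
)
print(profile)
```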
pivottablejs
pivottablejs.pivot_ui(time_slice(data, '2013-2017'),)
pandas_profiling
pandas_profiling.ProfileReport(time_slice(data, '2013-2017'))
Dataset info
| Number of variables | 55 |
|---|---|
| Number of observations | 199 |
| Total Missing (%) | 6.4% |
| Total size in memory | 85.6 KiB |
| Average record size in memory | 440.4 B |
Variables types
| Numeric | 30 |
|---|---|
| Categorical | 0 |
| Date | 0 |
| Text (Unique) | 1 |
| Rejected | 24 |
Warnings
accounted_flow has 71 / 35.7% zeros
accounted_flow has 7 / 3.5% missing values (Missing)
accounted_flow_border_rivers has 150 / 75.4% zeros
accounted_flow_border_rivers has 7 / 3.5% missing values (Missing)
agg_to_gdp has 32 / 16.1% missing values (Missing)
arable_land has 3 / 1.5% zeros
arable_land has 3 / 1.5% missing values (Missing)
avg_annual_rain_depth has 18 / 9.0% missing values (Missing)
avg_annual_rain_vol has 16 / 8.0% missing values (Missing)
cultivated_area is highly correlated with arable_land (ρ = 0.99585) (Rejected)
dam_capacity_per_capita has 75 / 37.7% missing values (Missing)
dependency_ratio has 68 / 34.2% zeros
dependency_ratio has 7 / 3.5% missing values (Missing)
flood_occurence has 7 / 3.5% zeros
flood_occurence has 23 / 11.6% missing values (Missing)
gdp has 11 / 5.5% missing values (Missing)
gdp_per_capita has 11 / 5.5% missing values (Missing)
gender_inequal_index has 43 / 21.6% missing values (Missing)
groundwater_accounted_inflow has 178 / 89.4% zeros
groundwater_accounted_inflow has 7 / 3.5% missing values (Missing)
groundwater_accounted_outflow has 140 / 70.4% zeros
groundwater_accounted_outflow has 45 / 22.6% missing values (Missing)
groundwater_entering has 179 / 89.9% zeros
groundwater_entering has 7 / 3.5% missing values (Missing)
groundwater_produced has 2 / 1.0% zeros
groundwater_produced has 29 / 14.6% missing values (Missing)
groundwater_to_other_countries is highly correlated with groundwater_accounted_outflow (ρ = 1) (Rejected)
human_dev_index has 12 / 6.0% missing values (Missing)
interannual_variability has 33 / 16.6% missing values (Missing)
irrigation_potential has 88 / 44.2% missing values (Missing)
irwr is highly correlated with avg_annual_rain_vol (ρ = 0.96167) (Rejected)
irwr_per_capita has 18 / 9.0% missing values (Missing)
number_undernourished is highly correlated with irrigation_potential (ρ = 0.96711) (Rejected)
overlap_surface_groundwater is highly correlated with groundwater_produced (ρ = 0.9919) (Rejected)
percent_cultivated has 3 / 1.5% missing values (Missing)
percent_undernourished has 116 / 58.3% missing values (Missing)
permanent_crop_area has 6 / 3.0% zeros
permanent_crop_area has 3 / 1.5% missing values (Missing)
rural_pop is highly correlated with number_undernourished (ρ = 0.99226) (Rejected)
rural_pop_access_drinking has 16 / 8.0% missing values (Missing)
seasonal_variability has 33 / 16.6% missing values (Missing)
surface_entering is highly correlated with accounted_flow (ρ = 0.98177) (Rejected)
surface_groundwater_overlap is highly correlated with overlap_surface_groundwater (ρ = 1) (Rejected)
surface_inflow_secure_treaty has 178 / 89.4% zeros
surface_inflow_secure_treaty has 7 / 3.5% missing values (Missing)
surface_inflow_submit_no_treaty is highly correlated with surface_entering (ρ = 0.99629) (Rejected)
surface_inflow_submit_treaty is highly correlated with surface_inflow_secure_treaty (ρ = 0.979) (Rejected)
surface_outflow_secure_treaty has 179 / 89.9% zeros
surface_outflow_secure_treaty has 5 / 2.5% missing values (Missing)
surface_outflow_submit_no_treaty has 78 / 39.2% zeros
surface_outflow_submit_no_treaty has 16 / 8.0% missing values (Missing)
surface_outflow_submit_treaty is highly correlated with surface_outflow_secure_treaty (ρ = 0.97841) (Rejected)
surface_to_other_countries is highly correlated with surface_outflow_submit_no_treaty (ρ = 0.99643) (Rejected)
surface_total_external_renewable is highly correlated with surface_inflow_submit_no_treaty (ρ = 0.97923) (Rejected)
surface_water_produced is highly correlated with irwr (ρ = 0.99953) (Rejected)
total_area has 2 / 1.0% missing values (Missing)
total_dam_capacity is highly correlated with number_undernourished (ρ = 0.90598) (Rejected)
total_flow_border_rivers is highly correlated with accounted_flow_border_rivers (ρ = 0.96631) (Rejected)
total_pop is highly correlated with rural_pop (ρ = 0.96084) (Rejected)
total_pop_access_drinking is highly correlated with rural_pop_access_drinking (ρ = 0.94921) (Rejected)
total_renewable is highly correlated with surface_water_produced (ρ = 0.97515) (Rejected)
total_renewable_groundwater is highly correlated with surface_groundwater_overlap (ρ = 0.99191) (Rejected)
total_renewable_per_capita is highly correlated with irwr_per_capita (ρ = 0.97641) (Rejected)
total_renewable_surface is highly correlated with total_renewable (ρ = 0.99966) (Rejected)
urban_pop is highly correlated with total_pop (ρ = 0.95137) (Rejected)
urban_pop_access_drinking has 10 / 5.0% missing values (Missing)
water_total_external_renewable is highly correlated with surface_total_external_renewable (ρ = 1) (Rejected)
accounted_flow
Numeric
| Distinct count | 113 |
|---|---|
| Unique (%) | 58.9% |
| Missing (%) | 3.5% |
| Missing (n) | 7 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 63.557 |
|---|---|
| Minimum | 0 |
| Maximum | 2986 |
| Zeros (%) | 35.7% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 3 |
| Q3 | 26.065 |
| 95-th percentile | 270.63 |
| Maximum | 2986 |
| Range | 2986 |
| Interquartile range | 26.065 |
Descriptive statistics
| Standard deviation | 250.32 |
|---|---|
| Coef of variation | 3.9386 |
| Kurtosis | 99.577 |
| Mean | 63.557 |
| MAD | 93.89 |
| Skewness | 9.0735 |
| Sum | 12203 |
| Variance | 62662 |
| Memory size | 1.6 KiB |
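The descriptive statistics in this table can be reproduced with a few pandas calls, which helps when sanity-checking a single variable by hand. The sketch below uses a made-up, right-skewed series. It assumes that "MAD" in the report means the mean absolute deviation around the mean, which is not stated in the output itself. The coefficient of variation is the standard deviation divided by the mean, consistent with the table above (250.32 / 63.557 ≈ 3.9386).

```python
import pandas as pd

# Made-up, right-skewed values for illustration.
s = pd.Series([0.0, 0.0, 3.0, 26.0, 271.0])

mean = s.mean()
std = s.std()                    # sample standard deviation (ddof=1)
coef_of_variation = std / mean   # standard deviation relative to the mean
mad = (s - mean).abs().mean()    # assumed meaning of "MAD": mean absolute deviation
skewness = s.skew()              # positive for a long right tail
kurtosis = s.kurt()              # excess kurtosis (0 for a normal distribution)

print(mean, std, coef_of_variation, mad, skewness, kurtosis)
```

A large positive skewness together with a median far below the mean, as for accounted_flow, is a hint that a log scale may be more informative when plotting.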
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 71 | 35.7% |
| 3.0 | 3 | 1.5% |
| 2.0 | 3 | 1.5% |
| 11.0 | 3 | 1.5% |
| 80.0 | 2 | 1.0% |
| 10.15 | 2 | 1.0% |
| 0.3 | 2 | 1.0% |
| 1.0 | 2 | 1.0% |
| 53.32 | 1 | 0.5% |
| 524.7 | 1 | 0.5% |
| Other values (102) | 102 | 51.3% |
| (Missing) | 7 | 3.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 71 | 35.7% |
| 0.015 | 1 | 0.5% |
| 0.038 | 1 | 0.5% |
| 0.096 | 1 | 0.5% |
| 0.165 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 584.2 | 1 | 0.5% |
| 610.0 | 1 | 0.5% |
| 635.2 | 1 | 0.5% |
| 1122.0 | 1 | 0.5% |
| 2986.0 | 1 | 0.5% |
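Frequency tables like the ones above can be rebuilt for any single variable with `value_counts`, `nsmallest` and `nlargest`. The sketch below uses a made-up series and computes the frequency relative to all rows, including missing ones, which appears to be how the percentages above are derived (for example 71 zeros out of 199 observations is 35.7%).

```python
import numpy as np
import pandas as pd

# Made-up values for illustration, including one missing entry.
s = pd.Series([0.0, 0.0, 0.0, 2.0, 2.0, 11.0, 53.32, np.nan])

# Most common values with counts; dropna=False keeps a row for missing values.
counts = s.value_counts(dropna=False)
freq_pct = (counts / len(s) * 100).round(1)

# Extreme values: the smallest and largest distinct values.
distinct = pd.Series(s.dropna().unique())
smallest = distinct.nsmallest(3)
largest = distinct.nlargest(3)

print(pd.DataFrame({"count": counts, "frequency_pct": freq_pct}))
print(smallest.tolist(), largest.tolist())
```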
accounted_flow_border_rivers
Numeric
| Distinct count | 41 |
|---|---|
| Unique (%) | 21.4% |
| Missing (%) | 3.5% |
| Missing (n) | 7 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 8.3438 |
|---|---|
| Minimum | 0 |
| Maximum | 558 |
| Zeros (%) | 75.4% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 0 |
| Q3 | 0 |
| 95-th percentile | 29.198 |
| Maximum | 558 |
| Range | 558 |
| Interquartile range | 0 |
Descriptive statistics
| Standard deviation | 46.811 |
|---|---|
| Coef of variation | 5.6103 |
| Kurtosis | 103.31 |
| Mean | 8.3438 |
| MAD | 14.415 |
| Skewness | 9.4653 |
| Sum | 1602 |
| Variance | 2191.3 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 150 | 75.4% |
| 10.15 | 2 | 1.0% |
| 1.45 | 2 | 1.0% |
| 11.0 | 2 | 1.0% |
| 3.815 | 1 | 0.5% |
| 7.74 | 1 | 0.5% |
| 34.33 | 1 | 0.5% |
| 25.0 | 1 | 0.5% |
| 75.0 | 1 | 0.5% |
| 1.25 | 1 | 0.5% |
| Other values (30) | 30 | 15.1% |
| (Missing) | 7 | 3.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 150 | 75.4% |
| 0.035 | 1 | 0.5% |
| 0.038 | 1 | 0.5% |
| 0.14 | 1 | 0.5% |
| 0.432 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 84.05 | 1 | 0.5% |
| 110.0 | 1 | 0.5% |
| 197.5 | 1 | 0.5% |
| 214.1 | 1 | 0.5% |
| 558.0 | 1 | 0.5% |
agg_to_gdp
Numeric
| Distinct count | 165 |
|---|---|
| Unique (%) | 98.8% |
| Missing (%) | 16.1% |
| Missing (n) | 32 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 12.545 |
|---|---|
| Minimum | 0.0349 |
| Maximum | 59.23 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0349 |
|---|---|
| 5-th percentile | 0.62919 |
| Q1 | 2.8885 |
| Median | 8.498 |
| Q3 | 19.465 |
| 95-th percentile | 36.168 |
| Maximum | 59.23 |
| Range | 59.195 |
| Interquartile range | 16.576 |
Descriptive statistics
| Standard deviation | 11.997 |
|---|---|
| Coef of variation | 0.95628 |
| Kurtosis | 1.5798 |
| Mean | 12.545 |
| MAD | 9.5593 |
| Skewness | 1.3303 |
| Sum | 2095 |
| Variance | 143.92 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 40.97 | 2 | 1.0% |
| 2.384 | 2 | 1.0% |
| 32.94 | 2 | 1.0% |
| 1.652 | 1 | 0.5% |
| 10.12 | 1 | 0.5% |
| 4.725 | 1 | 0.5% |
| 23.66 | 1 | 0.5% |
| 24.09 | 1 | 0.5% |
| 6.574 | 1 | 0.5% |
| 17.39 | 1 | 0.5% |
| Other values (154) | 154 | 77.4% |
| (Missing) | 32 | 16.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0349 | 1 | 0.5% |
| 0.1363 | 1 | 0.5% |
| 0.1815 | 1 | 0.5% |
| 0.3004 | 1 | 0.5% |
| 0.4083 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 42.89 | 1 | 0.5% |
| 43.92 | 1 | 0.5% |
| 47.52 | 1 | 0.5% |
| 52.39 | 1 | 0.5% |
| 59.23 | 1 | 0.5% |
arable_land
Numeric
| Distinct count | 173 |
|---|---|
| Unique (%) | 88.3% |
| Missing (%) | 1.5% |
| Missing (n) | 3 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 7229.3 |
|---|---|
| Minimum | 0 |
| Maximum | 156360 |
| Zeros (%) | 1.5% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1.9 |
| Q1 | 120 |
| Median | 1204.5 |
| Q3 | 4671.5 |
| 95-th percentile | 30963 |
| Maximum | 156360 |
| Range | 156360 |
| Interquartile range | 4551.5 |
Descriptive statistics
| Standard deviation | 21029 |
|---|---|
| Coef of variation | 2.9089 |
| Kurtosis | 31.537 |
| Mean | 7229.3 |
| MAD | 9460.8 |
| Skewness | 5.3616 |
| Sum | 1416900 |
| Variance | 442230000 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 2.0 | 4 | 2.0% |
| 1.0 | 4 | 2.0% |
| 3.0 | 3 | 1.5% |
| 3800.0 | 3 | 1.5% |
| 0.0 | 3 | 1.5% |
| 5.0 | 3 | 1.5% |
| 2350.0 | 2 | 1.0% |
| 120.0 | 2 | 1.0% |
| 800.0 | 2 | 1.0% |
| 300.0 | 2 | 1.0% |
| Other values (162) | 168 | 84.4% |
| (Missing) | 3 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 3 | 1.5% |
| 0.08 | 1 | 0.5% |
| 0.56 | 1 | 0.5% |
| 1.0 | 4 | 2.0% |
| 1.6 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 80017.0 | 1 | 0.5% |
| 106298.0 | 1 | 0.5% |
| 123122.0 | 1 | 0.5% |
| 154605.0 | 1 | 0.5% |
| 156360.0 | 1 | 0.5% |
avg_annual_rain_depth
Numeric
| Distinct count | 174 |
|---|---|
| Unique (%) | 96.1% |
| Missing (%) | 9.0% |
| Missing (n) | 18 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 1166 |
|---|---|
| Minimum | 51 |
| Maximum | 3240 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 51 |
|---|---|
| 5-th percentile | 121 |
| Q1 | 562 |
| Median | 1030 |
| Q3 | 1705 |
| 95-th percentile | 2702 |
| Maximum | 3240 |
| Range | 3189 |
| Interquartile range | 1143 |
Descriptive statistics
| Standard deviation | 800.11 |
|---|---|
| Coef of variation | 0.68622 |
| Kurtosis | -0.40545 |
| Mean | 1166 |
| MAD | 662.11 |
| Skewness | 0.65639 |
| Sum | 211040 |
| Variance | 640180 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 250.0 | 2 | 1.0% |
| 788.0 | 2 | 1.0% |
| 900.0 | 2 | 1.0% |
| 1500.0 | 2 | 1.0% |
| 1274.0 | 2 | 1.0% |
| 282.0 | 2 | 1.0% |
| 2200.0 | 2 | 1.0% |
| 228.0 | 2 | 1.0% |
| 657.0 | 1 | 0.5% |
| 241.0 | 1 | 0.5% |
| Other values (163) | 163 | 81.9% |
| (Missing) | 18 | 9.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 51.0 | 1 | 0.5% |
| 56.0 | 1 | 0.5% |
| 59.0 | 1 | 0.5% |
| 74.0 | 1 | 0.5% |
| 78.0 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2928.0 | 1 | 0.5% |
| 3028.0 | 1 | 0.5% |
| 3142.0 | 1 | 0.5% |
| 3200.0 | 1 | 0.5% |
| 3240.0 | 1 | 0.5% |
avg_annual_rain_vol
Numeric
| Distinct count | 181 |
|---|---|
| Unique (%) | 98.9% |
| Missing (%) | 8.0% |
| Missing (n) | 16 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 595.32 |
|---|---|
| Minimum | 0.064 |
| Maximum | 14995 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.064 |
|---|---|
| 5-th percentile | 0.86507 |
| Q1 | 33.095 |
| Median | 127.8 |
| Q3 | 434.5 |
| 95-th percentile | 3427.4 |
| Maximum | 14995 |
| Range | 14995 |
| Interquartile range | 401.4 |
Descriptive statistics
| Standard deviation | 1590.4 |
|---|---|
| Coef of variation | 2.6715 |
| Kurtosis | 41.211 |
| Mean | 595.32 |
| MAD | 734.91 |
| Skewness | 5.7112 |
| Sum | 108940 |
| Variance | 2529400 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1259.0 | 2 | 1.0% |
| 220.8 | 2 | 1.0% |
| 297.2 | 2 | 1.0% |
| 279.2 | 1 | 0.5% |
| 3618.0 | 1 | 0.5% |
| 199.8 | 1 | 0.5% |
| 25.86 | 1 | 0.5% |
| 1415.0 | 1 | 0.5% |
| 7030.0 | 1 | 0.5% |
| 513.1 | 1 | 0.5% |
| Other values (170) | 170 | 85.4% |
| (Missing) | 16 | 8.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.064 | 1 | 0.5% |
| 0.1792 | 1 | 0.5% |
| 0.371 | 1 | 0.5% |
| 0.4532 | 1 | 0.5% |
| 0.4724 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5362.0 | 1 | 0.5% |
| 6192.0 | 1 | 0.5% |
| 7030.0 | 1 | 0.5% |
| 7865.0 | 1 | 0.5% |
| 14995.0 | 1 | 0.5% |
country
Categorical, Unique
| First 3 values |
|---|
| Papua New Guinea |
| Bolivia (Plurinational State of) |
| Malaysia |
| Last 3 values |
|---|
| Maldives |
| Marshall Islands |
| Iraq |
First 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| Afghanistan | 1 | 0.5% |
| Albania | 1 | 0.5% |
| Algeria | 1 | 0.5% |
| Andorra | 1 | 0.5% |
| Angola | 1 | 0.5% |
Last 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| Venezuela (Bolivarian Republic of) | 1 | 0.5% |
| Viet Nam | 1 | 0.5% |
| Yemen | 1 | 0.5% |
| Zambia | 1 | 0.5% |
| Zimbabwe | 1 | 0.5% |
cultivated_area
Highly correlated
This variable is highly correlated with arable_land and should be ignored for analysis
| Correlation | 0.99585 |
|---|
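A "highly correlated" flag of this kind can be approximated by scanning the pairwise correlation matrix for values above a cut-off. The sketch below uses made-up data under borrowed column names. The 0.9 `THRESHOLD` is an assumption, chosen only because the lowest correlation flagged in the warnings is about 0.906. The report's actual setting and method are not shown in this output.

```python
import numpy as np
import pandas as pd

# Made-up values; two columns are nearly proportional, one is unrelated.
wide = pd.DataFrame(
    {
        "arable_land": [100.0, 200.0, 300.0, 400.0, 500.0],
        "cultivated_area": [110.0, 215.0, 330.0, 420.0, 540.0],
        "flood_occurence": [3.1, 0.5, 2.9, 3.6, 1.2],
    }
)

THRESHOLD = 0.9  # assumed cut-off, not confirmed by the report output

corr = wide.corr()  # Pearson correlation on pairwise-complete observations

# Keep only the upper triangle so each pair is listed once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack()

# Comparisons with NaN are False, so masked cells never pass the filter.
high_pairs = pairs[pairs.abs() > THRESHOLD]
print(high_pairs)
```

A high correlation is a reason to look closer rather than an automatic reason to drop a column: several flagged pairs in the warnings, such as rural_pop and number_undernourished, may simply both scale with country size.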
dam_capacity_per_capita
Numeric
| Distinct count | 124 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 37.7% |
| Missing (n) | 75 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 1580.3 |
|---|---|
| Minimum | 0.1873 |
| Maximum | 36832 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.1873 |
|---|---|
| 5-th percentile | 3.2488 |
| Q1 | 71.072 |
| Median | 328.5 |
| Q3 | 1205.5 |
| 95-th percentile | 5561.6 |
| Maximum | 36832 |
| Range | 36832 |
| Interquartile range | 1134.4 |
Descriptive statistics
| Standard deviation | 4130.5 |
|---|---|
| Coef of variation | 2.6138 |
| Kurtosis | 48.996 |
| Mean | 1580.3 |
| MAD | 1934 |
| Skewness | 6.4315 |
| Sum | 195950 |
| Variance | 17061000 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 633.1 | 2 | 1.0% |
| 65.35 | 1 | 0.5% |
| 3370.0 | 1 | 0.5% |
| 61.76 | 1 | 0.5% |
| 55.49 | 1 | 0.5% |
| 1321.0 | 1 | 0.5% |
| 4.704 | 1 | 0.5% |
| 1055.0 | 1 | 0.5% |
| 38.97 | 1 | 0.5% |
| 2050.0 | 1 | 0.5% |
| Other values (113) | 113 | 56.8% |
| (Missing) | 75 | 37.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1873 | 1 | 0.5% |
| 0.6846 | 1 | 0.5% |
| 1.948 | 1 | 0.5% |
| 1.969 | 1 | 0.5% |
| 2.16 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 6386.0 | 1 | 0.5% |
| 6405.0 | 1 | 0.5% |
| 7001.0 | 1 | 0.5% |
| 23414.0 | 1 | 0.5% |
| 36832.0 | 1 | 0.5% |
dependency_ratio
Numeric
| Distinct count | 125 |
|---|---|
| Unique (%) | 65.1% |
| Missing (%) | 3.5% |
| Missing (n) | 7 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 22.819 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros (%) | 34.2% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 6.183 |
| Q3 | 40.8 |
| 95-th percentile | 87.299 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range | 40.8 |
Descriptive statistics
| Standard deviation | 29.869 |
|---|---|
| Coef of variation | 1.309 |
| Kurtosis | 0.010321 |
| Mean | 22.819 |
| MAD | 25.123 |
| Skewness | 1.1509 |
| Sum | 4381.2 |
| Variance | 892.16 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 68 | 34.2% |
| 30.52 | 2 | 1.0% |
| 80.39 | 1 | 0.5% |
| 4.123 | 1 | 0.5% |
| 14.63 | 1 | 0.5% |
| 7.407 | 1 | 0.5% |
| 24.49 | 1 | 0.5% |
| 21.77 | 1 | 0.5% |
| 5.769 | 1 | 0.5% |
| 64.27 | 1 | 0.5% |
| Other values (114) | 114 | 57.3% |
| (Missing) | 7 | 3.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 68 | 34.2% |
| 0.2691 | 1 | 0.5% |
| 0.2695 | 1 | 0.5% |
| 0.7496 | 1 | 0.5% |
| 0.7854 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 96.49 | 1 | 0.5% |
| 96.55 | 1 | 0.5% |
| 96.91 | 1 | 0.5% |
| 97.0 | 1 | 0.5% |
| 100.0 | 1 | 0.5% |
flood_occurence
Numeric
| Distinct count | 41 |
|---|---|
| Unique (%) | 23.3% |
| Missing (%) | 11.6% |
| Missing (n) | 23 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 2.6955 |
|---|---|
| Minimum | 0 |
| Maximum | 4.9 |
| Zeros (%) | 3.5% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.35 |
| Q1 | 2.3 |
| Median | 2.9 |
| Q3 | 3.325 |
| 95-th percentile | 3.825 |
| Maximum | 4.9 |
| Range | 4.9 |
| Interquartile range | 1.025 |
Descriptive statistics
| Standard deviation | 0.99495 |
|---|---|
| Coef of variation | 0.36912 |
| Kurtosis | 0.96957 |
| Mean | 2.6955 |
| MAD | 0.75212 |
| Skewness | -1.0132 |
| Sum | 474.4 |
| Variance | 0.98992 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 3.6 | 12 | 6.0% |
| 3.0 | 11 | 5.5% |
| 3.3 | 11 | 5.5% |
| 3.1 | 11 | 5.5% |
| 2.5 | 10 | 5.0% |
| 2.9 | 8 | 4.0% |
| 3.5 | 8 | 4.0% |
| 2.8 | 8 | 4.0% |
| 2.7 | 7 | 3.5% |
| 0.0 | 7 | 3.5% |
| Other values (30) | 83 | 41.7% |
| (Missing) | 23 | 11.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 7 | 3.5% |
| 0.1 | 1 | 0.5% |
| 0.2 | 1 | 0.5% |
| 0.4 | 1 | 0.5% |
| 0.6 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.9 | 5 | 2.5% |
| 4.0 | 1 | 0.5% |
| 4.5 | 1 | 0.5% |
| 4.7 | 1 | 0.5% |
| 4.9 | 1 | 0.5% |
gdp
Numeric
| Distinct count | 185 |
|---|---|
| Unique (%) | 98.4% |
| Missing (%) | 5.5% |
| Missing (n) | 11 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 384890000000 |
|---|---|
| Minimum | 37860000 |
| Maximum | 17900000000000 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 37860000 |
|---|---|
| 5-th percentile | 754760000 |
| Q1 | 7675800000 |
| Median | 30475000000 |
| Q3 | 192500000000 |
| 95-th percentile | 1490500000000 |
| Maximum | 17900000000000 |
| Range | 17900000000000 |
| Interquartile range | 184820000000 |
Descriptive statistics
| Standard deviation | 1604200000000 |
|---|---|
| Coef of variation | 4.1681 |
| Kurtosis | 85.899 |
| Mean | 384890000000 |
| MAD | 549640000000 |
| Skewness | 8.7022 |
| Sum | 72359000000000 |
| Variance | 2.5736e+24 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 195000000000.0 | 2 | 1.0% |
| 167000000000.0 | 2 | 1.0% |
| 296000000000.0 | 2 | 1.0% |
| 292000000000.0 | 2 | 1.0% |
| 2.85e+12 | 1 | 0.5% |
| 13779570706.0 | 1 | 0.5% |
| 35237742278.0 | 1 | 0.5% |
| 199000000000.0 | 1 | 0.5% |
| 52132289700.0 | 1 | 0.5% |
| 11099473097.0 | 1 | 0.5% |
| Other values (174) | 174 | 87.4% |
| (Missing) | 11 | 5.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 37859554.0 | 1 | 0.5% |
| 145237022.0 | 1 | 0.5% |
| 186716626.0 | 1 | 0.5% |
| 287400000.0 | 1 | 0.5% |
| 318071979.0 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.85e+12 | 1 | 0.5% |
| 3.36e+12 | 1 | 0.5% |
| 4.12e+12 | 1 | 0.5% |
| 1.09e+13 | 1 | 0.5% |
| 1.79e+13 | 1 | 0.5% |
gdp_per_capita
Numeric
| Distinct count | 188 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 5.5% |
| Missing (n) | 11 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 12531 |
| Minimum | 276 |
| Maximum | 101910 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 276 |
|---|---|
| 5-th percentile | 537.11 |
| Q1 | 1732.2 |
| Median | 4911.5 |
| Q3 | 14474 |
| 95-th percentile | 50644 |
| Maximum | 101910 |
| Range | 101640 |
| Interquartile range | 12742 |
Descriptive statistics
| Standard deviation | 17543 |
|---|---|
| Coef of variation | 1.4 |
| Kurtosis | 5.4979 |
| Mean | 12531 |
| MAD | 12391 |
| Skewness | 2.2522 |
| Sum | 2355700 |
| Variance | 307760000 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 3974.0 | 2 | 1.0% |
| 26018.0 | 1 | 0.5% |
| 1773.0 | 1 | 0.5% |
| 3822.0 | 1 | 0.5% |
| 6862.0 | 1 | 0.5% |
| 1429.0 | 1 | 0.5% |
| 23030.0 | 1 | 0.5% |
| 17282.0 | 1 | 0.5% |
| 1579.0 | 1 | 0.5% |
| 3491.0 | 1 | 0.5% |
| Other values (177) | 177 | 88.9% |
| (Missing) | 11 | 5.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 276.0 | 1 | 0.5% |
| 306.8 | 1 | 0.5% |
| 359.0 | 1 | 0.5% |
| 381.4 | 1 | 0.5% |
| 411.8 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 55906.0 | 1 | 0.5% |
| 74458.0 | 1 | 0.5% |
| 74720.0 | 1 | 0.5% |
| 80130.0 | 1 | 0.5% |
| 101911.0 | 1 | 0.5% |
gender_inequal_index
Numeric
| Distinct count | 155 |
|---|---|
| Unique (%) | 99.4% |
| Missing (%) | 21.6% |
| Missing (n) | 43 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.36695 |
| Minimum | 0.0164 |
| Maximum | 0.744 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0164 |
|---|---|
| 5-th percentile | 0.066375 |
| Q1 | 0.18738 |
| Median | 0.38605 |
| Q3 | 0.52572 |
| 95-th percentile | 0.65758 |
| Maximum | 0.744 |
| Range | 0.7276 |
| Interquartile range | 0.33835 |
Descriptive statistics
| Standard deviation | 0.19133 |
|---|---|
| Coef of variation | 0.52141 |
| Kurtosis | -1.1218 |
| Mean | 0.36695 |
| MAD | 0.16375 |
| Skewness | -0.10515 |
| Sum | 57.244 |
| Variance | 0.036608 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1247 | 2 | 1.0% |
| 0.1636 | 2 | 1.0% |
| 0.1507 | 1 | 0.5% |
| 0.3573 | 1 | 0.5% |
| 0.4485 | 1 | 0.5% |
| 0.5329 | 1 | 0.5% |
| 0.6224 | 1 | 0.5% |
| 0.0884 | 1 | 0.5% |
| 0.4796 | 1 | 0.5% |
| 0.4134 | 1 | 0.5% |
| Other values (144) | 144 | 72.4% |
| (Missing) | 43 | 21.6% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0164 | 1 | 0.5% |
| 0.0278 | 1 | 0.5% |
| 0.0407 | 1 | 0.5% |
| 0.0484 | 1 | 0.5% |
| 0.0528 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.6789 | 1 | 0.5% |
| 0.6934 | 1 | 0.5% |
| 0.7065 | 1 | 0.5% |
| 0.7132 | 1 | 0.5% |
| 0.744 | 1 | 0.5% |
groundwater_accounted_inflow
Numeric
| Distinct count | 15 |
|---|---|
| Unique (%) | 7.8% |
| Missing (%) | 3.5% |
| Missing (n) | 7 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.012557 |
| Minimum | -1.2 |
| Maximum | 1.33 |
| Zeros (%) | 89.4% |
Quantile statistics
| Minimum | -1.2 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.0245 |
| Maximum | 1.33 |
| Range | 2.53 |
| Interquartile range | 0 |
Descriptive statistics
| Standard deviation | 0.1577 |
|---|---|
| Coef of variation | 12.559 |
| Kurtosis | 52.707 |
| Mean | 0.012557 |
| MAD | 0.036051 |
| Skewness | 2.4695 |
| Sum | 2.411 |
| Variance | 0.02487 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 178 | 89.4% |
| 0.08 | 2 | 1.0% |
| 0.01 | 1 | 0.5% |
| 0.032 | 1 | 0.5% |
| -1.2 | 1 | 0.5% |
| 0.725 | 1 | 0.5% |
| 0.002 | 1 | 0.5% |
| 1.33 | 1 | 0.5% |
| 0.03 | 1 | 0.5% |
| 0.02 | 1 | 0.5% |
| Other values (4) | 4 | 2.0% |
| (Missing) | 7 | 3.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -1.2 | 1 | 0.5% |
| 0.0 | 178 | 89.4% |
| 0.002 | 1 | 0.5% |
| 0.01 | 1 | 0.5% |
| 0.02 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1 | 1 | 0.5% |
| 0.112 | 1 | 0.5% |
| 0.725 | 1 | 0.5% |
| 1.0 | 1 | 0.5% |
| 1.33 | 1 | 0.5% |
groundwater_accounted_outflow
Numeric
| Distinct count | 15 |
|---|---|
| Unique (%) | 9.7% |
| Missing (%) | 22.6% |
| Missing (n) | 45 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.27273 |
| Minimum | 0 |
| Maximum | 26.12 |
| Zeros (%) | 70.4% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.301 |
| Maximum | 26.12 |
| Range | 26.12 |
| Interquartile range | 0 |
Descriptive statistics
| Standard deviation | 2.2802 |
|---|---|
| Coef of variation | 8.3604 |
| Kurtosis | 112.51 |
| Mean | 0.27273 |
| MAD | 0.51012 |
| Skewness | 10.334 |
| Sum | 42.001 |
| Variance | 5.1991 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 140 | 70.4% |
| 0.95 | 2 | 1.0% |
| 0.032 | 1 | 0.5% |
| 0.03 | 1 | 0.5% |
| 0.34 | 1 | 0.5% |
| 0.025 | 1 | 0.5% |
| 0.1 | 1 | 0.5% |
| 0.7 | 1 | 0.5% |
| 26.12 | 1 | 0.5% |
| 0.394 | 1 | 0.5% |
| Other values (4) | 4 | 2.0% |
| (Missing) | 45 | 22.6% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 140 | 70.4% |
| 0.025 | 1 | 0.5% |
| 0.03 | 1 | 0.5% |
| 0.032 | 1 | 0.5% |
| 0.08 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.7 | 1 | 0.5% |
| 0.95 | 2 | 1.0% |
| 1.0 | 1 | 0.5% |
| 11.0 | 1 | 0.5% |
| 26.12 | 1 | 0.5% |
groundwater_entering
Numeric
| Distinct count | 14 |
|---|---|
| Unique (%) | 7.3% |
| Missing (%) | 3.5% |
| Missing (n) | 7 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.070786 |
| Minimum | 0 |
| Maximum | 11.13 |
| Zeros (%) | 89.9% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.0245 |
| Maximum | 11.13 |
| Range | 11.13 |
| Interquartile range | 0 |
Descriptive statistics
| Standard deviation | 0.80753 |
|---|---|
| Coef of variation | 11.408 |
| Kurtosis | 187.02 |
| Mean | 0.070786 |
| MAD | 0.13469 |
| Skewness | 13.6 |
| Sum | 13.591 |
| Variance | 0.6521 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 179 | 89.9% |
| 0.08 | 2 | 1.0% |
| 0.01 | 1 | 0.5% |
| 0.032 | 1 | 0.5% |
| 0.725 | 1 | 0.5% |
| 0.002 | 1 | 0.5% |
| 0.27 | 1 | 0.5% |
| 0.03 | 1 | 0.5% |
| 0.02 | 1 | 0.5% |
| 0.112 | 1 | 0.5% |
| Other values (3) | 3 | 1.5% |
| (Missing) | 7 | 3.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 179 | 89.9% |
| 0.002 | 1 | 0.5% |
| 0.01 | 1 | 0.5% |
| 0.02 | 1 | 0.5% |
| 0.03 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.112 | 1 | 0.5% |
| 0.27 | 1 | 0.5% |
| 0.725 | 1 | 0.5% |
| 1.0 | 1 | 0.5% |
| 11.13 | 1 | 0.5% |
groundwater_produced
Numeric
| Distinct count | 150 |
|---|---|
| Unique (%) | 88.2% |
| Missing (%) | 14.6% |
| Missing (n) | 29 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 62.768 |
| Minimum | 0 |
| Maximum | 1383 |
| Zeros (%) | 1.0% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.0767 |
| Q1 | 2.2 |
| Median | 9.65 |
| Q3 | 40.98 |
| 95-th percentile | 398.05 |
| Maximum | 1383 |
| Range | 1383 |
| Interquartile range | 38.78 |
Descriptive statistics
| Standard deviation | 165.13 |
|---|---|
| Coef of variation | 2.6308 |
| Kurtosis | 29.206 |
| Mean | 62.768 |
| MAD | 82.406 |
| Skewness | 4.8804 |
| Sum | 10670 |
| Variance | 27268 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 6.0 | 4 | 2.0% |
| 20.0 | 4 | 2.0% |
| 0.5 | 4 | 2.0% |
| 2.5 | 3 | 1.5% |
| 1.3 | 3 | 1.5% |
| 4.0 | 3 | 1.5% |
| 3.2 | 2 | 1.0% |
| 55.0 | 2 | 1.0% |
| 2.2 | 2 | 1.0% |
| 10.0 | 2 | 1.0% |
| Other values (139) | 141 | 70.9% |
| (Missing) | 29 | 14.6% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2 | 1.0% |
| 0.01 | 1 | 0.5% |
| 0.015 | 1 | 0.5% |
| 0.02 | 1 | 0.5% |
| 0.03 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 510.0 | 1 | 0.5% |
| 645.6 | 1 | 0.5% |
| 788.0 | 1 | 0.5% |
| 828.8 | 1 | 0.5% |
| 1383.0 | 1 | 0.5% |
groundwater_to_other_countries
Highly correlated
This variable is highly correlated with groundwater_accounted_outflow and should be ignored for analysis
| Correlation | 1 |
|---|---|
human_dev_index
Numeric
| Distinct count | 186 |
|---|---|
| Unique (%) | 99.5% |
| Missing (%) | 6.0% |
| Missing (n) | 12 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.69128 |
| Minimum | 0.3483 |
| Maximum | 0.9439 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.3483 |
|---|---|
| 5-th percentile | 0.41939 |
| Q1 | 0.57255 |
| Median | 0.7238 |
| Q3 | 0.80925 |
| 95-th percentile | 0.91278 |
| Maximum | 0.9439 |
| Range | 0.5956 |
| Interquartile range | 0.2367 |
Descriptive statistics
| Standard deviation | 0.15429 |
|---|---|
| Coef of variation | 0.2232 |
| Kurtosis | -0.91053 |
| Mean | 0.69128 |
| MAD | 0.12987 |
| Skewness | -0.35983 |
| Sum | 129.27 |
| Variance | 0.023807 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.715 | 2 | 1.0% |
| 0.7928 | 2 | 1.0% |
| 0.514 | 1 | 0.5% |
| 0.6087 | 1 | 0.5% |
| 0.6276 | 1 | 0.5% |
| 0.575 | 1 | 0.5% |
| 0.8701 | 1 | 0.5% |
| 0.8175 | 1 | 0.5% |
| 0.6357 | 1 | 0.5% |
| 0.9155 | 1 | 0.5% |
| Other values (175) | 175 | 87.9% |
| (Missing) | 12 | 6.0% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.3483 | 1 | 0.5% |
| 0.3501 | 1 | 0.5% |
| 0.3909 | 1 | 0.5% |
| 0.3919 | 1 | 0.5% |
| 0.3999 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.9218 | 1 | 0.5% |
| 0.9233 | 1 | 0.5% |
| 0.9296 | 1 | 0.5% |
| 0.935 | 1 | 0.5% |
| 0.9439 | 1 | 0.5% |
interannual_variability
Numeric
| Distinct count | 33 |
|---|---|
| Unique (%) | 19.9% |
| Missing (%) | 16.6% |
| Missing (n) | 33 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 1.7584 |
| Minimum | 0.6 |
| Maximum | 4.9 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.6 |
|---|---|
| 5-th percentile | 0.8 |
| Q1 | 1.1 |
| Median | 1.5 |
| Q3 | 2.3 |
| 95-th percentile | 3.5 |
| Maximum | 4.9 |
| Range | 4.3 |
| Interquartile range | 1.2 |
Descriptive statistics
| Standard deviation | 0.88408 |
|---|---|
| Coef of variation | 0.50276 |
| Kurtosis | 0.88868 |
| Mean | 1.7584 |
| MAD | 0.71277 |
| Skewness | 1.1336 |
| Sum | 291.9 |
| Variance | 0.7816 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0 | 15 | 7.5% |
| 1.2 | 12 | 6.0% |
| 0.9 | 11 | 5.5% |
| 1.4 | 11 | 5.5% |
| 1.5 | 10 | 5.0% |
| 1.1 | 10 | 5.0% |
| 1.3 | 9 | 4.5% |
| 0.8 | 8 | 4.0% |
| 2.7 | 7 | 3.5% |
| 2.3 | 6 | 3.0% |
| Other values (22) | 67 | 33.7% |
| (Missing) | 33 | 16.6% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.6 | 3 | 1.5% |
| 0.7 | 2 | 1.0% |
| 0.8 | 8 | 4.0% |
| 0.9 | 11 | 5.5% |
| 1.0 | 15 | 7.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.6 | 2 | 1.0% |
| 3.8 | 1 | 0.5% |
| 4.2 | 2 | 1.0% |
| 4.3 | 2 | 1.0% |
| 4.9 | 1 | 0.5% |
irrigation_potential
Numeric
| Distinct count | 105 |
|---|---|
| Unique (%) | 94.6% |
| Missing (%) | 44.2% |
| Missing (n) | 88 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 4638.7 |
| Minimum | 0.2 |
| Maximum | 139500 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.2 |
|---|---|
| 5-th percentile | 7.465 |
| Q1 | 183.5 |
| Median | 566 |
| Q3 | 3099 |
| 95-th percentile | 15500 |
| Maximum | 139500 |
| Range | 139500 |
| Interquartile range | 2915.5 |
Descriptive statistics
| Standard deviation | 15300 |
|---|---|
| Coef of variation | 3.2984 |
| Kurtosis | 58.212 |
| Mean | 4638.7 |
| MAD | 6008.1 |
| Skewness | 7.1447 |
| Sum | 514900 |
| Variance | 234100000 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 2700.0 | 2 | 1.0% |
| 5500.0 | 2 | 1.0% |
| 165.0 | 2 | 1.0% |
| 600.0 | 2 | 1.0% |
| 1900.0 | 2 | 1.0% |
| 30.0 | 2 | 1.0% |
| 200.0 | 2 | 1.0% |
| 70000.0 | 1 | 0.5% |
| 40.0 | 1 | 0.5% |
| 0.894 | 1 | 0.5% |
| Other values (94) | 94 | 47.2% |
| (Missing) | 88 | 44.2% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.2 | 1 | 0.5% |
| 0.3 | 1 | 0.5% |
| 0.894 | 1 | 0.5% |
| 1.0 | 1 | 0.5% |
| 2.4 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21300.0 | 1 | 0.5% |
| 29000.0 | 1 | 0.5% |
| 29350.0 | 1 | 0.5% |
| 70000.0 | 1 | 0.5% |
| 139500.0 | 1 | 0.5% |
irwr
Highly correlated
This variable is highly correlated with avg_annual_rain_vol and should be ignored for analysis
| Correlation | 0.96167 |
|---|---|
irwr_per_capita
Numeric
| Distinct count | 182 |
|---|---|
| Unique (%) | 100.6% |
| Missing (%) | 9.0% |
| Missing (n) | 18 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 16036 |
| Minimum | 0 |
| Maximum | 516090 |
| Zeros (%) | 0.5% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 93.01 |
| Q1 | 913.2 |
| Median | 2599 |
| Q3 | 11227 |
| 95-th percentile | 72201 |
| Maximum | 516090 |
| Range | 516090 |
| Interquartile range | 10314 |
Descriptive statistics
| Standard deviation | 49232 |
|---|---|
| Coef of variation | 3.07 |
| Kurtosis | 66.768 |
| Mean | 16036 |
| MAD | 20476 |
| Skewness | 7.4684 |
| Sum | 2902600 |
| Variance | 2423700000 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 822.2 | 1 | 0.5% |
| 566.3 | 1 | 0.5% |
| 1571.0 | 1 | 0.5% |
| 11761.0 | 1 | 0.5% |
| 19444.0 | 1 | 0.5% |
| 2886.0 | 1 | 0.5% |
| 3303.0 | 1 | 0.5% |
| 1213.0 | 1 | 0.5% |
| 5372.0 | 1 | 0.5% |
| 3585.0 | 1 | 0.5% |
| Other values (171) | 171 | 85.9% |
| (Missing) | 18 | 9.0% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 1 | 0.5% |
| 2.905 | 1 | 0.5% |
| 16.38 | 1 | 0.5% |
| 19.67 | 1 | 0.5% |
| 25.06 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100671.0 | 1 | 0.5% |
| 105132.0 | 1 | 0.5% |
| 182320.0 | 1 | 0.5% |
| 314170.0 | 1 | 0.5% |
| 516090.0 | 1 | 0.5% |
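The `Unique (%)` of 100.6% reported for `irwr_per_capita` looks impossible, but it is consistent with the distinct count including the missing-value marker while the percentage is taken over non-missing rows: 182 / (199 - 18) is about 100.6%. This is an inference from the figures in the report (including a row count of 199 implied by 18 missing being 9.0%), not a statement about the profiler's source code. A small sketch on a made-up series shows the same effect in pandas:

```python
import numpy as np
import pandas as pd

# Hypothetical toy column: 4 distinct numbers plus 1 missing value.
s = pd.Series([1.0, 2.0, 3.0, 4.0, np.nan])

distinct_with_nan = s.nunique(dropna=False)  # counts NaN as its own value
distinct_without_nan = s.nunique()           # default: NaN is excluded
non_missing = s.count()                      # number of non-null rows

print(100 * distinct_with_nan / non_missing)     # exceeds 100
print(100 * distinct_without_nan / non_missing)  # cannot exceed 100

# Same arithmetic with the figures printed in the report block above.
report_unique_pct = 100 * 182 / (199 - 18)
print(round(report_unique_pct, 1))
```

When a uniqueness figure matters for the analysis, computing it directly with `nunique()` and `count()` avoids this ambiguity.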
number_undernourished
Highly correlated
This variable is highly correlated with irrigation_potential and should be ignored for analysis
| Correlation | 0.96711 |
|---|---|
overlap_surface_groundwater
Highly correlated
This variable is highly correlated with groundwater_produced and should be ignored for analysis
| Correlation | 0.9919 |
|---|---|
percent_cultivated
Numeric
| Distinct count | 195 |
|---|---|
| Unique (%) | 99.5% |
| Missing (%) | 1.5% |
| Missing (n) | 3 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 18.513 |
| Minimum | 0.0862 |
| Maximum | 63.41 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0862 |
|---|---|
| 5-th percentile | 0.96727 |
| Q1 | 5.9638 |
| Median | 14.68 |
| Q3 | 27.88 |
| 95-th percentile | 50.148 |
| Maximum | 63.41 |
| Range | 63.324 |
| Interquartile range | 21.916 |
Descriptive statistics
| Standard deviation | 15.496 |
|---|---|
| Coef of variation | 0.83705 |
| Kurtosis | 0.28068 |
| Mean | 18.513 |
| MAD | 12.504 |
| Skewness | 0.98211 |
| Sum | 3628.6 |
| Variance | 240.14 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 27.88 | 2 | 1.0% |
| 60.0 | 2 | 1.0% |
| 16.13 | 1 | 0.5% |
| 3.412 | 1 | 0.5% |
| 18.02 | 1 | 0.5% |
| 10.91 | 1 | 0.5% |
| 31.98 | 1 | 0.5% |
| 6.111 | 1 | 0.5% |
| 21.4 | 1 | 0.5% |
| 55.7 | 1 | 0.5% |
| Other values (184) | 184 | 92.5% |
| (Missing) | 3 | 1.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0862 | 1 | 0.5% |
| 0.2223 | 1 | 0.5% |
| 0.3658 | 1 | 0.5% |
| 0.4334 | 1 | 0.5% |
| 0.4473 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 56.76 | 1 | 0.5% |
| 57.57 | 1 | 0.5% |
| 60.0 | 2 | 1.0% |
| 62.3 | 1 | 0.5% |
| 63.41 | 1 | 0.5% |
percent_undernourished
Numeric
| Distinct count | 69 |
|---|---|
| Unique (%) | 83.1% |
| Missing (%) | 58.3% |
| Missing (n) | 116 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 17.223 |
| Minimum | 5.1 |
| Maximum | 53.4 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 5.1 |
|---|---|
| 5-th percentile | 5.5 |
| Q1 | 7.9 |
| Median | 13.5 |
| Q3 | 23.45 |
| 95-th percentile | 40.88 |
| Maximum | 53.4 |
| Range | 48.3 |
| Interquartile range | 15.55 |
Descriptive statistics
| Standard deviation | 11.416 |
|---|---|
| Coef of variation | 0.66285 |
| Kurtosis | 0.8264 |
| Mean | 17.223 |
| MAD | 9.2504 |
| Skewness | 1.1583 |
| Sum | 1429.5 |
| Variance | 130.33 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 7.4 | 3 | 1.5% |
| 14.2 | 3 | 1.5% |
| 20.7 | 3 | 1.5% |
| 7.5 | 2 | 1.0% |
| 9.5 | 2 | 1.0% |
| 16.4 | 2 | 1.0% |
| 26.8 | 2 | 1.0% |
| 6.2 | 2 | 1.0% |
| 22.0 | 2 | 1.0% |
| 15.9 | 2 | 1.0% |
| Other values (58) | 60 | 30.2% |
| (Missing) | 116 | 58.3% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5.1 | 2 | 1.0% |
| 5.2 | 1 | 0.5% |
| 5.3 | 1 | 0.5% |
| 5.5 | 2 | 1.0% |
| 5.6 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 41.6 | 1 | 0.5% |
| 42.3 | 1 | 0.5% |
| 47.7 | 1 | 0.5% |
| 47.8 | 1 | 0.5% |
| 53.4 | 1 | 0.5% |
permanent_crop_area
Numeric
| Distinct count | 155 |
|---|---|
| Unique (%) | 79.1% |
| Missing (%) | 1.5% |
| Missing (n) | 3 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 839.45 |
| Minimum | 0 |
| Maximum | 22500 |
| Zeros (%) | 3.0% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.575 |
| Q1 | 14.35 |
| Median | 112 |
| Q3 | 455.5 |
| 95-th percentile | 4500 |
| Maximum | 22500 |
| Range | 22500 |
| Interquartile range | 441.15 |
Descriptive statistics
| Standard deviation | 2429.3 |
|---|---|
| Coef of variation | 2.8939 |
| Kurtosis | 42.919 |
| Mean | 839.45 |
| MAD | 1128.2 |
| Skewness | 5.9581 |
| Sum | 164530 |
| Variance | 5901600 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 6 | 3.0% |
| 4.0 | 5 | 2.5% |
| 3.0 | 4 | 2.0% |
| 6.0 | 3 | 1.5% |
| 2.0 | 3 | 1.5% |
| 700.0 | 3 | 1.5% |
| 100.0 | 3 | 1.5% |
| 5.0 | 3 | 1.5% |
| 1.0 | 3 | 1.5% |
| 60.0 | 2 | 1.0% |
| Other values (144) | 161 | 80.9% |
| (Missing) | 3 | 1.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 6 | 3.0% |
| 0.1 | 2 | 1.0% |
| 0.4 | 1 | 0.5% |
| 0.5 | 1 | 0.5% |
| 0.6 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 6572.0 | 1 | 0.5% |
| 6600.0 | 1 | 0.5% |
| 13000.0 | 1 | 0.5% |
| 16226.0 | 1 | 0.5% |
| 22500.0 | 1 | 0.5% |
rural_pop
Highly correlated
This variable is highly correlated with number_undernourished and should be ignored for analysis
| Correlation | 0.99226 |
|---|---|
rural_pop_access_drinking
Numeric
| Distinct count | 118 |
|---|---|
| Unique (%) | 64.5% |
| Missing (%) | 8.0% |
| Missing (n) | 16 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 84.026 |
| Minimum | 28.2 |
| Maximum | 100 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 28.2 |
|---|---|
| 5-th percentile | 45.65 |
| Q1 | 72.95 |
| Median | 92.1 |
| Q3 | 99.35 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 71.8 |
| Interquartile range | 26.4 |
Descriptive statistics
| Standard deviation | 18.907 |
|---|---|
| Coef of variation | 0.22501 |
| Kurtosis | 0.37805 |
| Mean | 84.026 |
| MAD | 15.52 |
| Skewness | -1.1836 |
| Sum | 15377 |
| Variance | 357.46 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 100.0 | 39 | 19.6% |
| 99.0 | 6 | 3.0% |
| 98.3 | 3 | 1.5% |
| 92.1 | 2 | 1.0% |
| 67.3 | 2 | 1.0% |
| 97.0 | 2 | 1.0% |
| 73.8 | 2 | 1.0% |
| 99.7 | 2 | 1.0% |
| 95.1 | 2 | 1.0% |
| 69.4 | 2 | 1.0% |
| Other values (107) | 121 | 60.8% |
| (Missing) | 16 | 8.0% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 28.2 | 1 | 0.5% |
| 31.2 | 1 | 0.5% |
| 31.5 | 1 | 0.5% |
| 32.8 | 1 | 0.5% |
| 35.3 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 99.6 | 1 | 0.5% |
| 99.7 | 2 | 1.0% |
| 99.8 | 1 | 0.5% |
| 99.9 | 1 | 0.5% |
| 100.0 | 39 | 19.6% |
seasonal_variability
Numeric
| Distinct count | 43 |
|---|---|
| Unique (%) | 25.9% |
| Missing (%) | 16.6% |
| Missing (n) | 33 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 2.2904 |
| Minimum | 0.3 |
| Maximum | 4.6 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.3 |
|---|---|
| 5-th percentile | 0.625 |
| Q1 | 1.525 |
| Median | 2.3 |
| Q3 | 3.1 |
| 95-th percentile | 3.875 |
| Maximum | 4.6 |
| Range | 4.3 |
| Interquartile range | 1.575 |
Descriptive statistics
| Standard deviation | 1.0288 |
|---|---|
| Coef of variation | 0.44917 |
| Kurtosis | -0.87948 |
| Mean | 2.2904 |
| MAD | 0.87 |
| Skewness | 0.079612 |
| Sum | 380.2 |
| Variance | 1.0583 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 2.5 | 8 | 4.0% |
| 3.6 | 8 | 4.0% |
| 2.1 | 8 | 4.0% |
| 1.6 | 8 | 4.0% |
| 1.9 | 7 | 3.5% |
| 3.1 | 7 | 3.5% |
| 3.5 | 7 | 3.5% |
| 2.4 | 7 | 3.5% |
| 1.0 | 6 | 3.0% |
| 1.8 | 6 | 3.0% |
| Other values (32) | 94 | 47.2% |
| (Missing) | 33 | 16.6% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.3 | 1 | 0.5% |
| 0.4 | 2 | 1.0% |
| 0.5 | 1 | 0.5% |
| 0.6 | 5 | 2.5% |
| 0.7 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 4.0 | 3 | 1.5% |
| 4.1 | 1 | 0.5% |
| 4.2 | 1 | 0.5% |
| 4.4 | 1 | 0.5% |
| 4.6 | 2 | 1.0% |
surface_entering
Highly correlated
This variable is highly correlated with accounted_flow and should be ignored for analysis
| Correlation | 0.98177 |
|---|---|
surface_groundwater_overlap
Highly correlated
This variable is highly correlated with overlap_surface_groundwater and should be ignored for analysis
| Correlation | 1 |
|---|---|
surface_inflow_secure_treaty
Numeric
| Distinct count | 16 |
|---|---|
| Unique (%) | 8.3% |
| Missing (%) | 3.5% |
| Missing (n) | 7 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 2.1905 |
| Minimum | 0 |
| Maximum | 170.3 |
| Zeros (%) | 89.4% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 0 |
| Q3 | 0 |
| 95-th percentile | 2.6319 |
| Maximum | 170.3 |
| Range | 170.3 |
| Interquartile range | 0 |
Descriptive statistics
| Standard deviation | 14.253 |
|---|---|
| Coef of variation | 6.5067 |
| Kurtosis | 105.16 |
| Mean | 2.1905 |
| MAD | 4.1016 |
| Skewness | 9.5926 |
| Sum | 420.57 |
| Variance | 203.14 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 178 | 89.4% |
| 2.208 | 1 | 0.5% |
| 3.15 | 1 | 0.5% |
| 65.65 | 1 | 0.5% |
| 44.11 | 1 | 0.5% |
| 170.3 | 1 | 0.5% |
| 0.82 | 1 | 0.5% |
| 0.05 | 1 | 0.5% |
| 1.85 | 1 | 0.5% |
| 16.09 | 1 | 0.5% |
| Other values (5) | 5 | 2.5% |
| (Missing) | 7 | 3.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 178 | 89.4% |
| 0.05 | 1 | 0.5% |
| 0.82 | 1 | 0.5% |
| 1.85 | 1 | 0.5% |
| 2.208 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 26.5 | 1 | 0.5% |
| 44.11 | 1 | 0.5% |
| 55.5 | 1 | 0.5% |
| 65.65 | 1 | 0.5% |
| 170.3 | 1 | 0.5% |
surface_inflow_submit_no_treaty
Highly correlated
This variable is highly correlated with surface_entering and should be ignored for analysis
| Correlation | 0.99629 |
|---|---|
surface_inflow_submit_treaty
Highly correlated
This variable is highly correlated with surface_inflow_secure_treaty and should be ignored for analysis
| Correlation | 0.979 |
|---|---|
surface_outflow_secure_treaty
Numeric
| Distinct count | 17 |
|---|---|
| Unique (%) | 8.8% |
| Missing (%) | 2.5% |
| Missing (n) | 5 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 2.22 |
| Minimum | 0 |
| Maximum | 170.3 |
| Zeros (%) | 89.9% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 0 |
| Q3 | 0 |
| 95-th percentile | 1.3058 |
| Maximum | 170.3 |
| Range | 170.3 |
| Interquartile range | 0 |
Descriptive statistics
| Standard deviation | 14.168 |
|---|---|
| Coef of variation | 6.382 |
| Kurtosis | 106.2 |
| Mean | 2.22 |
| MAD | 4.1863 |
| Skewness | 9.6002 |
| Sum | 430.69 |
| Variance | 200.74 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 179 | 89.9% |
| 2.208 | 1 | 0.5% |
| 54.86 | 1 | 0.5% |
| 0.79 | 1 | 0.5% |
| 0.82 | 1 | 0.5% |
| 0.05 | 1 | 0.5% |
| 170.3 | 1 | 0.5% |
| 18.9 | 1 | 0.5% |
| 25.87 | 1 | 0.5% |
| 0.432 | 1 | 0.5% |
| Other values (6) | 6 | 3.0% |
| (Missing) | 5 | 2.5% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 179 | 89.9% |
| 0.05 | 1 | 0.5% |
| 0.335 | 1 | 0.5% |
| 0.432 | 1 | 0.5% |
| 0.79 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 26.5 | 1 | 0.5% |
| 33.12 | 1 | 0.5% |
| 54.86 | 1 | 0.5% |
| 65.5 | 1 | 0.5% |
| 170.3 | 1 | 0.5% |
surface_outflow_submit_no_treaty
Numeric
| Distinct count | 105 |
|---|---|
| Unique (%) | 57.4% |
| Missing (%) | 8.0% |
| Missing (n) | 16 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 55.924 |
| Minimum | 0 |
| Maximum | 1868 |
| Zeros (%) | 39.2% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 1.725 |
| Q3 | 18.135 |
| 95-th percentile | 193.68 |
| Maximum | 1868 |
| Range | 1868 |
| Interquartile range | 18.135 |
Descriptive statistics
| Standard deviation | 208.77 |
|---|---|
| Coef of variation | 3.7332 |
| Kurtosis | 43.517 |
| Mean | 55.924 |
| MAD | 84.158 |
| Skewness | 6.2161 |
| Sum | 10234 |
| Variance | 43586 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 78 | 39.2% |
| 3.0 | 2 | 1.0% |
| 13.2 | 2 | 1.0% |
| 48.0 | 1 | 0.5% |
| 37.0 | 1 | 0.5% |
| 160.0 | 1 | 0.5% |
| 4.86 | 1 | 0.5% |
| 9.655 | 1 | 0.5% |
| 0.177 | 1 | 0.5% |
| 6.145 | 1 | 0.5% |
| Other values (94) | 94 | 47.2% |
| (Missing) | 16 | 8.0% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 78 | 39.2% |
| 0.015 | 1 | 0.5% |
| 0.017 | 1 | 0.5% |
| 0.057 | 1 | 0.5% |
| 0.096 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 585.7 | 1 | 0.5% |
| 718.8 | 1 | 0.5% |
| 1142.0 | 1 | 0.5% |
| 1375.0 | 1 | 0.5% |
| 1868.0 | 1 | 0.5% |
surface_outflow_submit_treaty
Highly correlated
This variable is highly correlated with surface_outflow_secure_treaty and should be ignored for analysis
| Correlation | 0.97841 |
|---|---|
surface_to_other_countries
Highly correlated
This variable is highly correlated with surface_outflow_submit_no_treaty and should be ignored for analysis
| Correlation | 0.99643 |
|---|---|
surface_total_external_renewable
Highly correlated
This variable is highly correlated with surface_inflow_submit_no_treaty and should be ignored for analysis
| Correlation | 0.97923 |
|---|---|
surface_water_produced
Highly correlated
This variable is highly correlated with irwr and should be ignored for analysis
| Correlation | 0.99953 |
|---|---|
total_area
Numeric
| Distinct count | 195 |
|---|---|
| Unique (%) | 99.0% |
| Missing (%) | 1.0% |
| Missing (n) | 2 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 67954 |
| Minimum | 1 |
| Maximum | 1709800 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 31.6 |
| Q1 | 2207 |
| Median | 11760 |
| Q3 | 51312 |
| 95-th percentile | 235220 |
| Maximum | 1709800 |
| Range | 1709800 |
| Interquartile range | 49105 |
Descriptive statistics
| Standard deviation | 190910 |
|---|---|
| Coef of variation | 2.8094 |
| Kurtosis | 36.522 |
| Mean | 67954 |
| MAD | 85134 |
| Skewness | 5.6078 |
| Sum | 13387000 |
| Variance | 36445000000 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 26.0 | 2 | 1.0% |
| 75.0 | 2 | 1.0% |
| 46.0 | 2 | 1.0% |
| 25637.0 | 1 | 0.5% |
| 60355.0 | 1 | 0.5% |
| 44655.0 | 1 | 0.5% |
| 2571.0 | 1 | 0.5% |
| 126700.0 | 1 | 0.5% |
| 54909.0 | 1 | 0.5% |
| 11137.0 | 1 | 0.5% |
| Other values (184) | 184 | 92.5% |
| (Missing) | 2 | 1.0% |

Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0 | 1 | 0.5% |
| 2.0 | 1 | 0.5% |
| 3.0 | 1 | 0.5% |
| 6.0 | 1 | 0.5% |
| 16.0 | 1 | 0.5% |

Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 851577.0 | 1 | 0.5% |
| 960001.0 | 1 | 0.5% |
| 983151.0 | 1 | 0.5% |
| 998467.0 | 1 | 0.5% |
| 1709825.0 | 1 | 0.5% |
total_dam_capacity
Highly correlated
This variable is highly correlated with number_undernourished and should be ignored for analysis
| Correlation | 0.90598 |
|---|---|
total_flow_border_rivers
Highly correlated
This variable is highly correlated with accounted_flow_border_rivers and should be ignored for analysis
| Correlation | 0.96631 |
|---|---|
total_pop
Highly correlated
This variable is highly correlated with rural_pop and should be ignored for analysis
| Correlation | 0.96084 |
|---|---|
total_pop_access_drinking
Highly correlated
This variable is highly correlated with rural_pop_access_drinking and should be ignored for analysis
| Correlation | 0.94921 |
|---|---|
total_renewable
Highly correlated
This variable is highly correlated with surface_water_produced and should be ignored for analysis
| Correlation | 0.97515 |
|---|---|
total_renewable_groundwater
Highly correlated
This variable is highly correlated with surface_groundwater_overlap and should be ignored for analysis
| Correlation | 0.99191 |
|---|---|
total_renewable_per_capita
Highly correlated
This variable is highly correlated with irwr_per_capita and should be ignored for analysis
| Correlation | 0.97641 |
|---|---|
total_renewable_surface
Highly correlated
This variable is highly correlated with total_renewable and should be ignored for analysis
| Correlation | 0.99966 |
|---|---|
urban_pop
Highly correlated
This variable is highly correlated with total_pop and should be ignored for analysis
| Correlation | 0.95137 |
|---|---|
urban_pop_access_drinking
Numeric
| Distinct count | 88 |
|---|---|
| Unique (%) | 46.6% |
| Missing (%) | 5.0% |
| Missing (n) | 10 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 94.787 |
| Minimum | 50.7 |
| Maximum | 100 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 50.7 |
|---|---|
| 5-th percentile | 76.12 |
| Q1 | 93.8 |
| Median | 98.1 |
| Q3 | 99.9 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 49.3 |
| Interquartile range | 6.1 |
Descriptive statistics
| Standard deviation | 8.4156 |
|---|---|
| Coef of variation | 0.088784 |
| Kurtosis | 7.5084 |
| Mean | 94.787 |
| MAD | 5.5776 |
| Skewness | -2.6093 |
| Sum | 17915 |
| Variance | 70.822 |
| Memory size | 1.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 100.0 | 47 | 23.6% |
| 99.7 | 7 | 3.5% |
| 97.5 | 5 | 2.5% |
| 99.6 | 5 | 2.5% |
| 99.0 | 4 | 2.0% |
| 98.9 | 4 | 2.0% |
| 99.9 | 4 | 2.0% |
| 99.5 | 3 | 1.5% |
| 95.5 | 3 | 1.5% |
| 97.0 | 3 | 1.5% |
| Other values (77) | 104 | 52.3% |
| (Missing) | 10 | 5.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 50.7 | 1 | 0.5% |
| 58.4 | 1 | 0.5% |
| 64.9 | 1 | 0.5% |
| 66.0 | 1 | 0.5% |
| 66.4 | 1 | 0.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 99.6 | 5 | 2.5% |
| 99.7 | 7 | 3.5% |
| 99.8 | 2 | 1.0% |
| 99.9 | 4 | 2.0% |
| 100.0 | 47 | 23.6% |
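Several of the figures reported for urban_pop_access_drinking are derived from one another, so they can be cross-checked by hand. The sketch below is only an arithmetic check on the rounded values printed above, not the profiler's own computation. The row counts in particular are an inference: the number of non-missing rows is estimated as the reported sum divided by the reported mean.

```python
# Values copied from the report above for urban_pop_access_drinking.
minimum, maximum = 50.7, 100.0
q1, q3 = 93.8, 99.9
mean, std = 94.787, 8.4156
total, n_missing = 17915, 10
n_distinct, n_at_100 = 88, 47

value_range = maximum - minimum   # "Range"
iqr = q3 - q1                     # "Interquartile range"
cv = std / mean                   # "Coef of variation"
variance = std ** 2               # "Variance"

# Inferred, not reported: non-missing rows estimated from sum / mean.
n_valid = round(total / mean)
n_rows = n_valid + n_missing

missing_pct = 100 * n_missing / n_rows    # "Missing (%)"
unique_pct = 100 * n_distinct / n_valid   # "Unique (%)"
at_100_pct = 100 * n_at_100 / n_rows      # frequency of the value 100.0

print(f"range={value_range:.1f} iqr={iqr:.1f} cv={cv:.6f} variance={variance:.3f}")
print(f"rows={n_rows} missing={missing_pct:.1f}% unique={unique_pct:.1f}% at_100={at_100_pct:.1f}%")
```

If these reproduce the table, it suggests the percentages for missing values and value frequencies are taken over all rows, while the unique percentage is taken over non-missing rows only.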
water_total_external_renewable
Highly correlated
This variable is highly correlated with surface_total_external_renewable and should be ignored for analysis
| Correlation | 1 |
|---|---|
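The "Highly correlated" warnings above can be approximated directly with pandas. The following is a minimal sketch, not the profiler's implementation: the helper name `flag_highly_correlated`, the 0.9 threshold, the rule of comparing each column only with the columns before it, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def flag_highly_correlated(df, threshold=0.9):
    """Map each column to the first earlier column it is strongly correlated with.

    Hypothetical helper: uses Pearson correlation on numeric columns and
    flags a column when |r| with any preceding column exceeds `threshold`.
    """
    corr = df.select_dtypes("number").corr()
    cols = list(corr.columns)
    flagged = {}
    for i, col in enumerate(cols):
        for other in cols[:i]:
            r = corr.loc[col, other]
            if abs(r) > threshold:
                flagged[col] = (other, float(r))
                break
    return flagged


# Synthetic stand-in data; the column names only mimic the dataset's style.
rng = np.random.default_rng(0)
total_pop = rng.uniform(1, 100, size=200)
demo = pd.DataFrame({
    "total_pop": total_pop,
    "urban_pop": 0.6 * total_pop + rng.normal(0, 1, size=200),
    "seasonal_variability": rng.uniform(0, 1, size=200),
})

flagged = flag_highly_correlated(demo)
for col, (other, r) in flagged.items():
    print(f"{col} is highly correlated with {other} (r = {r:.5f})")
```

A flag like this only says that two columns carry nearly the same linear information. Which one to keep is still a judgment call; for example, you may prefer the more interpretable variable even if the report flags it as the one to ignore.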
| 2013-2017 | accounted_flow | accounted_flow_border_rivers | agg_to_gdp | arable_land | avg_annual_rain_depth | avg_annual_rain_vol | cultivated_area | dam_capacity_per_capita | dependency_ratio | flood_occurence | gdp | gdp_per_capita | gender_inequal_index | groundwater_accounted_inflow | groundwater_accounted_outflow | groundwater_entering | groundwater_produced | groundwater_to_other_countries | human_dev_index | interannual_variability | irrigation_potential | irwr | irwr_per_capita | number_undernourished | overlap_surface_groundwater | percent_cultivated | percent_undernourished | permanent_crop_area | rural_pop | rural_pop_access_drinking | seasonal_variability | surface_entering | surface_groundwater_overlap | surface_inflow_secure_treaty | surface_inflow_submit_no_treaty | surface_inflow_submit_treaty | surface_outflow_secure_treaty | surface_outflow_submit_no_treaty | surface_outflow_submit_treaty | surface_to_other_countries | surface_total_external_renewable | surface_water_produced | total_area | total_dam_capacity | total_flow_border_rivers | total_pop | total_pop_access_drinking | total_renewable | total_renewable_groundwater | total_renewable_per_capita | total_renewable_surface | urban_pop | urban_pop_access_drinking | water_total_external_renewable |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| country | ||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Afghanistan | 19.00 | 9.0 | 22.6000 | 7771.0 | 327.0 | 213.5000 | 7910.0 | 61.76 | 28.7200 | 3.7 | 1.919944e+10 | 590.3 | 0.6934 | 0.00 | NaN | 0.00 | 10.650 | NaN | 0.4653 | 2.5 | NaN | 47.1500 | 1450.0 | 8600.0 | 1.00 | 12.120 | 26.8 | 139.0 | 23980.00 | 47.0 | 2.5 | 10.00 | 1.00 | 0.0 | 10.00 | 0.0 | 0.82 | 35.52 | 6.7 | 42.22 | 18.18 | 37.50 | 65286.0 | 2.009 | 33.4 | 32527.00 | 55.3 | 65.3300 | 10.650 | 2008.0 | 55.68 | 8547.0 | 78.2 | 18.18 |
| Albania | 3.30 | 0.0 | 22.0500 | 615.6 | 1485.0 | 42.6900 | 696.0 | 1391.00 | 10.9300 | 2.7 | 1.145560e+10 | 3954.0 | 0.2174 | 0.00 | 0.0 | 0.00 | 6.200 | 0.0 | 0.7328 | 1.2 | NaN | 26.9000 | 9285.0 | NaN | 2.35 | 24.210 | NaN | 80.4 | 1062.00 | 95.2 | 2.4 | 3.30 | 2.35 | 0.0 | 3.30 | 0.0 | 0.00 | 11.50 | 0.0 | 11.50 | 3.30 | 23.05 | 2875.0 | 4.030 | 0.0 | 2897.00 | 95.1 | 30.2000 | 6.200 | 10425.0 | 26.35 | 1835.0 | 94.9 | 3.30 |
| Algeria | 0.39 | 0.0 | 13.0500 | 7469.0 | 89.0 | 212.0000 | 8439.0 | 209.30 | 3.5990 | 2.8 | 1.670000e+11 | 4210.0 | 0.4131 | 0.03 | 0.1 | 0.03 | 1.487 | 0.1 | 0.7356 | 2.3 | 1300.0 | 11.2500 | 283.6 | NaN | 0.00 | 3.543 | NaN | 969.8 | 10928.00 | 81.8 | 1.9 | 0.39 | 0.00 | 0.0 | 0.39 | 0.0 | 0.00 | 0.32 | 0.0 | 0.32 | 0.39 | 9.76 | 238174.0 | 8.304 | 0.0 | 39667.00 | 83.6 | 11.6700 | 1.517 | 294.2 | 10.15 | 28739.0 | 84.3 | 0.42 |
| Andorra | NaN | NaN | 0.5239 | 2.8 | NaN | 0.4724 | 2.8 | NaN | NaN | 3.3 | 3.249101e+09 | 46106.0 | NaN | NaN | NaN | NaN | NaN | NaN | 0.8446 | 1.5 | NaN | 0.3156 | 4479.0 | NaN | NaN | 5.957 | NaN | 0.0 | 1.57 | 100.0 | 1.6 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 47.0 | NaN | NaN | 70.47 | 100.0 | 0.3156 | NaN | 4479.0 | NaN | 68.9 | 100.0 | NaN |
| Angola | 0.40 | 0.0 | NaN | 4900.0 | 1010.0 | 1259.0000 | 5190.0 | 377.50 | 0.2695 | 1.7 | 1.030000e+11 | 4116.0 | NaN | 0.00 | 0.0 | 0.00 | 58.000 | 0.0 | 0.5316 | 2.5 | 3700.0 | 148.0000 | 5915.0 | 3200.0 | 55.00 | 4.163 | 14.2 | 290.0 | 14970.00 | 28.2 | 3.1 | 0.40 | 55.00 | 0.0 | 0.40 | 0.0 | 0.00 | 122.80 | 0.0 | 122.80 | 0.40 | 145.00 | 124670.0 | 9.445 | 0.0 | 25022.00 | 49.0 | 148.4000 | 58.000 | 5931.0 | 145.40 | 10052.0 | 75.4 | 0.40 |